F0 contour and segmental duration modeling using prosodic features
Authors
Abstract
This paper proposes a framework for F0 contour generation and segmental duration modeling, intended for use in BOSS, a unit-selection speech synthesis system for Polish. We describe the design of the F0 and duration modeling modules and emphasize the role of prosodic features (related to stress, pitch accent and phrase) in these two tasks.
Similar resources
Noise Robust Speech Recognition Using Prosodic Information
This paper proposes a noise robust speech recognition method for Japanese utterances using prosodic information. In Japanese, the fundamental frequency (F0) contour conveys phrase intonation and word accent information. Consequently, it also conveys information about prosodic phrase and word boundaries. This paper first proposes a noise robust F0 extraction method using the Hough transform, whi...
Dependency analysis of read Japanese sentences using pause and F0 information: a speaker independent case
This paper deals with the problem of exploiting prosodic information in syntactic analysis of sentences. Duration of pauses at phrase boundaries and relative F0 contour features have been found to be effective for parsing in the speaker-dependent case. In this paper, effectiveness of pause and F0 information was examined in a speaker-independent manner by using prosodic features extracted from the s...
Word Prominence Detection using Robust yet Simple Prosodic Features
Automatic detection of word prominence can provide valuable information for downstream applications such as spoken language understanding. Prior work on automatic word prominence detection exploits a variety of lexical, syntactic, and prosodic features and models the task as a sequence labeling problem (independently or using context). While lexical and syntactic features are highly correlated wi...
Corpus-based generation of prosodic features from text based on generation process model
A total scheme of generating prosodic features from a text input was constructed. The method consists of corpus-based prediction of pauses, phone durations and fundamental frequencies (F0's), in this order, and information predicted in an earlier process is utilized in the following processes. Since prediction of F0's is done on the command values of F0 contour generation process model instead ...
Noise robust speech recognition using F0 contour extracted by Hough transform
This paper proposes a noise robust speech recognition method using prosodic information. In Japanese, fundamental frequency (F0) contour represents phrase intonation and word accent information. Consequently, it conveys information about prosodic phrase and word boundaries. This paper first proposes a noise robust F0 extraction method using Hough transform, which achieves high extraction rates ...